ABSTRACT

We discuss the possibility of performing blind surveys to detect large-scale features 
of the universe using 21cm emission. Using instruments with ~ 5'-10' resolution
currently in the planning stage, it should be possible to detect virialized galaxy clusters 
at intermediate redshifts using the combined emission from their constituent galaxies, 
as well as less overdense structures, such as proto-clusters and the 'cosmic web', at 
higher redshifts. Using semi-analytic methods we compute the number of virialized 
objects and those at turnaround which might be detected by such surveys. We find 
a surprisingly large number of objects might be detected even using small (~ 5%) 
bandwidths and elaborate on some issues pertinent to optimising the design of the 
instrument and the survey strategy. The main uncertainty is the fraction of neutral 
gas relative to the total dark matter within the object. We discuss this issue in the 
context of the observations which are currently available. 



1 INTRODUCTION 

Emission and absorption of electromagnetic radiation due to transitions
between the hyperfine states of neutral hydrogen (HI) (Field 1958) have had a
significant impact on our understanding of our galaxy and our immediate cosmic
neighbourhood. At present, however, the highest redshift detection of 21cm
emission is at z ~ 0.18 (Zwaan et al. 2001) and only very shallow surveys
(z < 0.04) of the whole sky have been performed (Zwaan et al. 1997; Kilborn et
al. 1999; Ryan-Weber et al. 2002; Lang et al. 2003). In this article we point
out that it might soon be possible to perform blind, unbiased searches for
large objects, both virialized and collapsing, using 21cm emission as their
tracer.

It has long been speculated that it might be possible to detect the onset of
the epoch of reionization using redshifted 21cm emission (Scott & Rees 1990;
Madau et al. 1997) and indeed, spurred on by the claim that the reionization
epoch might have been at much higher redshifts (z ≈ 17) than previously
thought (Spergel et al. 2003), there has been much recent work on this subject
(Furlanetto et al. 2003; Zaldarriaga et al. 2003; Gnedin & Shaver 2003).
Moreover, it has also been pointed out that the evolution of the spin
temperature, $T_s$, prior to reionization might be such that the large-scale
structure can be observed in absorption against the cosmic microwave
background (CMB) at very low frequencies (~ 30-50 MHz), enabling accurate
determination of a variety of important cosmological parameters (Loeb &
Zaldarriaga 2003). Our motivation is somewhat different: to map the
large-scale structure (LSS) once it has developed sufficiently for the spin
temperature to be given by the kinetic temperature of the gas, ~ 1000 K, and
$T_s \gg T_{\rm CMB}$, that is, using 21cm emission.

For various proposed instruments, our aim is to make conservative estimates
of the number of virialized clusters that might be found in the range z < 1,
and also collapsing objects with lower overdensities at z > 1. These higher
redshift objects are the proto-clusters which would have formed virialized
clusters by the present day and in some sense characterize the so-called
'cosmic web' of LSS observed in cosmological N-body simulations. They are
likely to be HI rich, but naively, virialized objects would not necessarily be
the most obvious places to look for neutral gas, since the process of
virialization and the creation of the intracluster medium involves violent
gravitational processes which are likely to strip neutral gas from the
constituent galaxies by tidal interactions and ram pressure (Gunn & Gott
1972). However, by virtue of their large overdensity such objects should still
contain a substantial neutral component, probably larger at high redshifts
than locally, albeit locked up in galaxies. The fiducial detection threshold
that we will discuss in this paper will be $\approx 10^{11} M_\odot$ of HI,
which we will show could be possible on large patches of the sky; it is worth
pointing out that this would only require a cluster to contain 20 gas-rich
spirals with HI masses $\sim 5 \times 10^{9} M_\odot$. This is likely to be the
case for many clusters.

Most importantly, in the context of this discussion, all the objects we will
consider, both virialized and collapsing, will have angular diameters
~ 5'-10', which will match the resolution of the telescopes and arrays with
~ 100 m baselines currently being planned to detect redshifted 21cm emission.
This will make them ideal survey targets since they would fill the beam, in
contrast to ordinary galaxies at the same redshift whose signal will be
substantially diluted and will most likely be below the confusion limit.

Blind surveys for galaxies and galaxy clusters have been performed in the
optical waveband for many years. The most recent galaxy redshift surveys
(2dFGRS and SDSS) have yielded significant constraints on cosmological
parameters via measurements of the power spectrum and redshift space
distortions, as well as a wealth of understanding of the properties of the
individual galaxies (Peacock 2003; Tegmark et al. 2003). If deep redshift
surveys using HI could be performed they would open up a new window on the
universe which could have some technical advantages over optical approaches.
In particular, using the flexible spectral resolution of radio receivers,
objects can be selected using spectroscopic methods rather than, for example,
photometrically from optical plates, making them flux limited over the
specific redshift range probed. The structure of the selection function
should, therefore, be simple to define. It should also be possible to deduce
the redshift and accurate line-widths for individual objects, allowing one to
probe the underlying mass distribution directly and establish the precise
selection criteria a posteriori. This could overcome some of the issues of
bias inherent in optical surveys. Moreover, potential confusion with other
spectral lines is unlikely in the radio waveband since there are very few
strong lines.

A standard line of argument often put forward is that the optical surveys
are sensitive to the sum of all the starlight from the galaxies and could be
significantly biased by the non-linearities involved in the star-formation
process, while the neutral component is of primordial origin and hence is
unbiased. This is unlikely to be completely true since the neutral component
is also involved in these processes. However, observing the HI component is
more than just another view of the same object. The HI halos of nearby
galaxies extend out very much further than their optical counterparts,
suggesting that the dynamics of the HI contains extra information.

As with galaxies, the study of clusters has focused on the central region
where diffuse hot gas is situated. This emits very strongly in the X-ray
waveband and can also lead to fluctuations in the CMB via the
Sunyaev-Zeldovich (SZ) effect. However, typically the core radius of the
cluster is only about 1/10th of the linear size of the object and 1/1000th of
the volume. It is likely that the study of these objects using HI as the
tracer will lead to new insights in an analogous way to galaxies. In
particular, many of the HI rich spiral galaxies are likely to be situated away
from the core and there is also a possibility of a component of diffuse HI
similar to that seen around individual galaxies.

At present, radio telescopes which are sufficiently sensitive to map the sky
quickly enough to be competitive with optical surveys do not exist. However,
the next few years could see the start of development of instruments which are
many times more powerful in terms of survey speed than those presently
available. The Square Kilometre Array (SKA) (see www.skatelescope.org for
details) is the ultimate goal of much technology development currently
underway. Although its precise design is still evolving, its projected
specification involves a collecting area of $\sim 10^{6}\,{\rm m}^2$,
sub-arcsecond resolution, a large instantaneous field of view and significant
bandwidth in the region where redshifted HI will be detected. Such an
instrument will have the ability to perform significant galaxy redshift
surveys tracing HI over a wide range of redshifts.

Prototypes are being planned which would be around 1/100th of the full SKA
in area. Searches for galaxies at low redshift, extending those currently
being performed, would be an obvious possibility for such an instrument. Our
aim here is to discuss issues related to the design of such an instrument in
the context of a blind survey for large objects at intermediate and high
redshifts, such as virialized galaxy clusters and collapsing proto-clusters,
which are, as we have already discussed, the ideal targets for these surveys
in terms of their size. Our conclusion is that the planned instruments will be
sufficiently sensitive to perform significant surveys.

First, we will discuss different proposed concepts for such instruments and
perform a simple sensitivity calculation for the limiting HI mass
($M_{\rm HI}^{\rm lim}$) of a survey assuming a fixed integration time (see
footnote 1). We will then discuss the possibility of detecting virialized
clusters. In order to make the link with theoretical predictions of the number
of dark matter halos, we need to make some assumption as to the HI content of
individual dark matter halos. This is a very complicated issue and since we
are, at this stage, only trying to make a rough estimate of the number of
clusters one might find, we will assume that there exists a universal ratio
$f_{\rm HI}(M,z) = M_{\rm HI}/M$ which relates the dark matter mass of a halo,
$M$, to its neutral gas component at a given redshift. This should be a good
approximation in the mass range relevant to clusters, but is unlikely to be so
for smaller masses. We will make various arguments in order to estimate this
fraction. At each stage we will attempt to make conservative assumptions and
therefore our estimate of the number of objects found in a given survey should
be a lower bound. Under these assumptions we can then deduce the number of
objects that a survey might find using the results of N-body simulations. We
then discuss some issues related to the design of the instrument, the survey
strategy and confusion due to HI in the field. Finally, we will adapt our
calculations to compute the number of objects at turnaround and discuss issues
relating to their detection.

Throughout this paper, we will assume a cosmological model which is flat,
with the densities of matter and the cosmological constant relative to
critical given by $\Omega_m = 0.3$ and $\Omega_\Lambda = 0.7$. The Hubble
constant used is 72 km sec$^{-1}$ Mpc$^{-1}$ and the power spectrum
normalization is assumed to be $\sigma_8 = 0.9$. These values are compatible
with a range of observations (Spergel et al. 2003; Contaldi et al. 2003).



2 PROPOSED INSTRUMENTS 

A number of different concepts are under discussion which could make
observations in the frequency range 400 MHz < $f_{\rm obs}$ < 1200 MHz. The
proposed instruments will, in fact, be sensitive to a much wider range of
frequencies (up to 1700 MHz), but this is the most sensible range to discuss
the detection of large objects since they will be close to, or completely,
unresolved. The design features which are of interest to us here are the
overall collecting area, $A$, the aperture efficiency, $\eta$, the
instantaneous bandwidth, $\Delta f_{\rm inst}$, the instantaneous
field-of-view (FOV), $\Omega_{\rm inst}$, the longest baseline, $d$, and the
system noise temperature of the receivers, $T_{\rm sys}^{\rm rec}$. The overall
system temperature, $T_{\rm sys}$, at frequency $f$ can be conservatively
estimated by taking into account the CMB and galactic background in a cold
part of the sky.

1 We note that all integration times discussed in this paper are
'on-source' integration times and do not take into account any
practical difficulties required to make the observations. These
could vary significantly for different experimental configurations.

We will focus on two different concepts which differ in the way they form
the instantaneous field of view. Firstly, we will consider a conventional
interferometer similar to the Allen Telescope Array (ATA), which comprises
$n_d$ dishes of diameter $D$. In this case
$\Omega_{\rm inst} \propto (\lambda/D)^2$ and $A = \pi n_d D^2/4$, where
$\lambda \propto 1/f_{\rm obs}$. The current design of the ATA has
$n_d \sim 350$, $D = 6.1$ m and it is likely that $\eta \sim 0.6$,
$d \sim 400$ m and $\Delta f_{\rm inst}/f_{\rm obs} \sim 0.05$. The FOV is set
by the size of the individual telescopes and the synthesized beam size is set
by the longest baseline. Such a configuration would have a synthesized beam
with $\theta_{\rm FWHM} = 3.2'$ at $f_{\rm obs} = 1000$ MHz and a
corresponding instantaneous FOV of 13.4 deg$^2$. Note that throughout we
assume a circular aperture, that is, $\theta_{\rm FWHM} = 1.22\lambda/d$,
which should be a reasonable approximation.
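
As a quick check of the numbers quoted above, the following minimal sketch
evaluates $\theta_{\rm FWHM} = 1.22\lambda/d$ and $A = \pi n_d D^2/4$ for the
ATA-like parameters in the text. The function names are illustrative choices
made here and are not taken from any instrument software.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in m/s


def wavelength_m(f_obs_mhz):
    """Observed wavelength in metres for a frequency given in MHz."""
    return C_M_PER_S / (f_obs_mhz * 1e6)


def theta_fwhm_arcmin(f_obs_mhz, baseline_m):
    """Synthesized beam FWHM (arcmin) for a circular aperture: 1.22 lambda / d."""
    theta_rad = 1.22 * wavelength_m(f_obs_mhz) / baseline_m
    return math.degrees(theta_rad) * 60.0


def dish_array_area_m2(n_dishes, dish_diameter_m):
    """Total geometric collecting area of n_d circular dishes of diameter D."""
    return math.pi * n_dishes * dish_diameter_m ** 2 / 4.0


# ATA-like values quoted in the text: d ~ 400 m, n_d ~ 350, D = 6.1 m
beam = theta_fwhm_arcmin(1000.0, 400.0)
area = dish_array_area_m2(350, 6.1)
print(f"beam = {beam:.2f} arcmin, area = {area:.0f} m^2")
```

The beam comes out slightly above 3.1', in line with the rounded 3.2' quoted
in the text, and the geometric area is roughly $10^4$ m$^2$, about 1/100th of
the SKA specification.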

The other competing concept which we will consider is that of a phased
array which could potentially have a much larger field of view; the idea being
to use very low cost antennae, arranged in $n_p$ patches of size $P$, and form
$n_b$ individual beams using off-the-shelf computer hardware. Hence,
$\Omega_{\rm inst} \propto n_b(\lambda/d)^2 \le (\lambda/P)^2$ and
$A = \pi n_p P^2/4$, assuming that the patches are circular, from which we can
deduce that $d \ge \sqrt{n_b}\,P$. Such a concept could have more than one
instantaneous FOV, possibly allowing for more continuous usage for HI studies.
Three possible scenarios for such an instrument are presented in table 1.
These range from conservative, setup I, to speculative, setup III. We should
note that for $P \approx 4$ m, none of the above configurations come close to
saturating the inequality on $P$. The value of $n_b$ should be taken as being
a guide figure since extra computing power could increase this close to the
maximum FOV set by $P$.

The standard formula (Roberts 1975) relating the HI mass, $M_{\rm HI}$, and
the observed flux density, $S$, integrated over a velocity width, $\Delta v$,
in an FRW universe is given by

$$M_{\rm HI} = \frac{2.35\times10^{5}\,M_\odot}{1+z}\left(\frac{d_L(z)}{\rm Mpc}\right)^2\left(\frac{S}{\rm Jy}\right)\left(\frac{\Delta v}{\rm km\,sec^{-1}}\right), \qquad (2)$$

where $d_L(z)$ is the luminosity distance to redshift $z$. For a fiducial
velocity width of $\Delta v \approx 800$ km sec$^{-1}$, assumed at this stage
to be independent of the mass, an object containing $10^{11} M_\odot$ of HI
will have $S_{800} \approx 4, 14, 42, 150, 840\,\mu$Jy at
$f_{\rm obs} = 400, 600, 800, 1000, 1200$ MHz, corresponding to
z = 2.55, 1.37, 0.78, 0.42, 0.18 respectively.
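
The flux densities quoted above can be reproduced approximately with a short
calculation. The sketch below assumes the commonly quoted form of the HI mass
relation, $M_{\rm HI} \simeq 2.36\times10^5\,M_\odot\,(1+z)^{-1}
(d_L/{\rm Mpc})^2 (S\Delta v/{\rm Jy\,km\,sec^{-1}})$, together with the flat
cosmology adopted in this paper; the coefficient, the $(1+z)$ convention and
all function names are assumptions made for illustration.

```python
import math

C_KM_S = 299_792.458      # speed of light, km/s
H0 = 72.0                 # Hubble constant, km/s/Mpc (value used in the text)
OMEGA_M, OMEGA_L = 0.3, 0.7
F_21_MHZ = 1420.40575     # rest frequency of the 21cm line


def e_of_z(z):
    """Dimensionless Hubble rate for a flat matter + Lambda model."""
    return math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def luminosity_distance_mpc(z, steps=2000):
    """d_L = (1+z) * c/H0 * int_0^z dz'/E(z'), by the trapezoid rule."""
    dz = z / steps
    total = 0.5 * (1.0 / e_of_z(0.0) + 1.0 / e_of_z(z))
    for i in range(1, steps):
        total += 1.0 / e_of_z(i * dz)
    return (1.0 + z) * C_KM_S / H0 * total * dz


def flux_density_ujy(m_hi_msun, z, dv_km_s=800.0):
    """Mean flux density (micro-Jy) over dv for a given HI mass (assumed relation)."""
    d_l = luminosity_distance_mpc(z)
    s_jy = m_hi_msun * (1.0 + z) / (2.36e5 * d_l ** 2 * dv_km_s)
    return s_jy * 1e6


for f_obs in (400, 600, 800, 1000, 1200):
    z = F_21_MHZ / f_obs - 1.0
    print(f_obs, round(z, 2), round(flux_density_ujy(1e11, z), 1))
```

With these assumptions the value at 1000 MHz is close to the 150 $\mu$Jy
quoted in the text; small differences at the other frequencies are expected
from rounding and from the exact coefficient used.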

Assuming that the clusters are unresolved, which will be a good
approximation at least for the phased array configurations at intermediate and
high redshift (see fig. 1 for virialized objects), then the rate at which one
can survey the sky over the instantaneous bandwidth is quantified by

Table 1. Three possible sets of design parameters for a phased array.
Included also are $\theta_{\rm FWHM}$ and $\Omega_{\rm inst}$ at
$f_{\rm obs} = 1000$ MHz, and the figure of merit $\mathcal{M}$ relative to
the configuration described in the text. In each case we will assume that
$\eta \approx 0.8$. This value, which is slightly larger than that
conventionally assumed for parabolic dishes, should be possible for these
arrays since they are much closer to the ground.




Figure 1. The angular diameter (twice the virial radius) of virialized
objects with $M_{\rm vir} = 10^{15} M_\odot$, $10^{14} M_\odot$ and
$10^{13} M_\odot$ (solid lines, top to bottom respectively). Included also is
the estimated beam size for the three different phased array configurations
(dashed lines: I top, II middle, III bottom). Only the closest and largest
objects will be resolved by the instruments discussed in this paper.

where $\Delta f_{\rm obj}$ is the frequency interval corresponding to
$\Delta v$. The noise, $S_\sigma$, attainable for a survey with angular
coverage, $\Delta\Omega$, in integration time, $t$, is given by
$S_\sigma = \mathcal{R}_{800}\sqrt{\Delta\Omega/t}$. This line of argument
should also yield important information even when the object is resolved.
Projected values for $\mathcal{R}_{800}$ are presented in table 2 for the ATA
and the different phased array setups using $f_{\rm obs} = 400$-$1200$ MHz,
along with the integration time required to achieve a $5\sigma$ detection
($S_{800} > 5S_\sigma$) of $M_{\rm HI}^{\rm lim} \approx 10^{11} M_\odot$ of
HI in $\Delta v \approx 800$ km sec$^{-1}$ on 100 deg$^2$. Note that in
reality the angular coverage of a given survey is likely to be an integer
multiple of the instantaneous FOV. However, this way of presenting the
sensitivity makes it possible to compare experiments with different FOVs in a
coherent way.
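
The relation $S_\sigma = \mathcal{R}_{800}\sqrt{\Delta\Omega/t}$ can be
inverted to give the integration time needed for a $5\sigma$ detection of a
line of given flux density. The sketch below does this; the value of
$\mathcal{R}_{800}$ used in the example is a hypothetical number chosen for
illustration and is not taken from table 2.

```python
def integration_time_days(r_800, area_deg2, s_line_mjy, n_sigma=5.0):
    """Invert S_sigma = R_800 * sqrt(area / t) for t.

    r_800      : survey sensitivity in mJy sec^(1/2) deg^(-1)
    area_deg2  : angular coverage of the survey in deg^2
    s_line_mjy : flux density of the target line in mJy
    n_sigma    : required detection significance
    """
    s_sigma = s_line_mjy / n_sigma            # noise level that must be reached
    t_sec = area_deg2 * (r_800 / s_sigma) ** 2
    return t_sec / 86400.0


# Hypothetical sensitivity of 1.8 mJy sec^(1/2) deg^(-1); a 150 micro-Jy line
# (10^11 solar masses of HI at 1000 MHz) searched for over 100 deg^2.
t_days = integration_time_days(r_800=1.8, area_deg2=100.0, s_line_mjy=0.15)
print(f"{t_days:.2f} days")
```

Because $t \propto \Delta\Omega\,\mathcal{R}_{800}^2/S_\sigma^2$, halving the
target flux density quadruples the time needed to cover the same area.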

Table 2. Approximate survey sensitivity $\mathcal{R}_{800}$ for
$f_{\rm obs} = 400$-$1200$ MHz in units of mJy sec$^{1/2}$ deg$^{-1}$ for the
ATA and the phased array setups I, II and III. Also presented is the
integration time $t_{800}$ in days required to achieve a $5\sigma$ detection
threshold of $M_{\rm HI} = 10^{11} M_\odot$ on 100 deg$^2$ and
$\Delta v = 800$ km sec$^{-1}$.

Figure 2. The number of dark matter halos with mass greater than
$M_{\rm vir}$ per deg$^2$ per 1% bandwidth. The dotted line is for
$f_{\rm obs} = 1200$ MHz, the solid line is for 1000 MHz, the short dashed
line for 800 MHz, the dot-dashed line for 600 MHz and the long dashed line for
400 MHz. Note that there is between 1 and 0.1 halo with
$M_{\rm vir} \approx 10^{14} M_\odot$ for $f_{\rm obs} > 600$ MHz.



3 DETECTING GALAXY CLUSTERS 

3.1 HI content of galaxy clusters 

As is almost always the case, the observed quantity $M_{\rm HI}$ is not
theoretically well understood. It is possible to make accurate predictions as
to the clustering of dark matter, but the evolution of the gas is much more
difficult to predict. Therefore, we will introduce the empirical quantity
$f_{\rm HI}(M,z)$, which we will define to be the fraction of HI in a dark
matter halo of mass $M$ at redshift $z$. We anticipate that this quantity is
likely to have a substantive scatter,
$\sigma_{\rm HI}^2 = \langle f_{\rm HI}^2\rangle - \langle f_{\rm HI}\rangle^2$,
but we shall assume this is zero since estimating it would be futile due to
the small number of observations available at present, and any scatter is
likely to increase the number of objects that are found in a given survey,
assuming that the mean value is correct. This is because there are typically
many more objects just below the mass limit than above it. We will first
present a simple order of magnitude estimate for $f_{\rm HI}$ and then go on
to discuss the current status of observations.

One can estimate the fraction of HI in clusters, $f_{\rm HI}^{\rm est}$,
from that in a typical galaxy, by taking into account the different mass to
luminosity ratios, $M/L$, of typical clusters and galaxies, and the fact that
only spiral galaxies are HI rich. If $f_{\rm spiral}$ is the fraction of
spiral galaxies in a typical cluster then

$$f_{\rm HI}^{\rm est} = f_{\rm spiral}\,f_{\rm HI}^{\rm gal}\,\frac{(M/L)_{\rm gal}}{(M/L)_{\rm cl}}. \qquad (4)$$

By considering the average properties of galaxies (Roberts & Haynes 1994),
the value of $f_{\rm HI}^{\rm gal}$ is observed to be ~ 0.06 for a typical
spiral galaxy at z ~ 0. The value of $M/L$ for such an object is likely to be
~ 8, whereas that for a typical cluster is ~ 240. It is believed that
$f_{\rm spiral} \sim 0.5$, making $f_{\rm HI}^{\rm est} \sim 10^{-3}$.
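
The arithmetic behind this order of magnitude estimate is simple enough to
check directly. The sketch below assumes the estimate takes the form
$f_{\rm HI}^{\rm est} = f_{\rm spiral} f_{\rm HI}^{\rm gal}
(M/L)_{\rm gal}/(M/L)_{\rm cl}$, which is the combination of the quoted inputs
that reproduces the quoted result; the function and argument names are
illustrative.

```python
def f_hi_cluster_estimate(f_spiral, f_hi_galaxy, ml_galaxy, ml_cluster):
    """Scale the HI fraction of a spiral galaxy to a cluster.

    The HI mass per unit luminosity of a spiral is f_hi_galaxy * ml_galaxy;
    only a fraction f_spiral of cluster light comes from spirals, and the
    cluster carries ml_cluster units of total mass per unit luminosity.
    """
    return f_spiral * f_hi_galaxy * ml_galaxy / ml_cluster


# Values quoted in the text: f_spiral ~ 0.5, f_HI(gal) ~ 0.06, M/L ~ 8 and ~ 240
f_est = f_hi_cluster_estimate(0.5, 0.06, 8.0, 240.0)
print(f"f_HI estimate = {f_est:.1e}")
```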

Another approach is to compute the value of $f_{\rm HI}$ for a cluster by
summing up the measured values of $M_{\rm HI}$ for the constituent galaxies
and computing their line of sight velocity dispersion, $\sigma$, and hence
their virial mass $M_{\rm vir} = 3\pi\sigma^2 R_H/(2G)$, where $R_H$ is the
mean harmonic radius and $G$ is the gravitational constant. We should note
that this is likely to under-estimate $f_{\rm HI}$ since it ignores the many
gas-rich dwarf galaxies which are below the detection threshold of any survey.
Moreover, it also ignores, as does the order of magnitude estimate above, the
possibility that there exists a diffuse component of HI such as the tidal
plumes of HI observed in galaxy groups (Appleton et al. 1981).
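
A minimal sketch of this second approach is given below. It evaluates
$M_{\rm vir} = 3\pi\sigma^2 R_H/(2G)$ and the resulting HI fraction for a
list of galaxy HI masses. The velocity dispersion, harmonic radius and galaxy
masses in the example are hypothetical values chosen only to illustrate the
calculation.

```python
import math

# Gravitational constant in Mpc (km/s)^2 per solar mass
G_MPC_KMS2_MSUN = 4.30091e-9


def virial_mass_msun(sigma_km_s, r_harmonic_mpc):
    """Virial mass from the line of sight dispersion and mean harmonic radius."""
    return 3.0 * math.pi * sigma_km_s ** 2 * r_harmonic_mpc / (2.0 * G_MPC_KMS2_MSUN)


def hi_fraction(galaxy_hi_masses_msun, sigma_km_s, r_harmonic_mpc):
    """Summed HI mass of the detected galaxies divided by the virial mass."""
    return sum(galaxy_hi_masses_msun) / virial_mass_msun(sigma_km_s, r_harmonic_mpc)


# Hypothetical cluster: sigma = 800 km/s, R_H = 1 Mpc, 100 spirals of 5e9 Msun each
m_vir = virial_mass_msun(800.0, 1.0)
f_hi = hi_fraction([5e9] * 100, 800.0, 1.0)
print(f"M_vir = {m_vir:.2e} Msun, f_HI = {f_hi:.1e}")
```

As noted in the text, a fraction computed this way is a lower limit, since
undetected dwarfs and any diffuse HI are not included in the sum.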

The HI content of the Virgo galaxy cluster is the best studied (Huchtmeier &
Richter 1986), but it is well known that this object is not completely
relaxed, being comprised of three interacting sub-clusters, and so the
computed value of $f_{\rm HI}$ is likely to be an unreliable statistical
indicator of the global value. Studies of the HI content of other clusters are
much less detailed. The Abell cluster A3128 has been surveyed for HI
(Chengalur et al. 2001). By co-adding the HI at the position of galaxies
brighter than an r-magnitude of 16.2, a total HI mass of
$1.3\times10^{11} M_\odot$ was detected. $M_{\rm vir}$ for this cluster is
$\approx 1.5\times10^{14} M_\odot$, estimated from the velocity dispersion and
the angular distribution of the galaxies at an assumed distance of 240 Mpc.
For the detected galaxies $M_{\rm HI}/M_{\rm vir} \sim 0.9\times10^{-3}$,
compatible with our earlier order of magnitude estimate. A recent sensitive
multi-beam study of the Centaurus A group has been made (Banks et al. 1999)
and this survey was able to detect HI in galaxies over a wide range of
luminosities and, therefore, includes at least part of the dwarf galaxy
contribution. For this group $M_{\rm HI}/M_{\rm vir} \sim 1.1\times10^{-3}$.

Many different arguments suggest that $f_{\rm HI}$ should increase with
redshift. HI is the basic fuel for star formation and it is believed that for
an individual object $\dot{M}_\star \propto M_{\rm HI}$. Since the global star
formation rate is known to grow approximately linearly with redshift to a
maximum about five times the local value at z ~ 2-3, it seems likely that
$f_{\rm HI}(M, z\sim1) \approx 5\times f_{\rm HI}(M, z\sim0)$; an effect which
may be even stronger within clusters. A related, but logically distinct, fact
is the observation that high redshift clusters contain a larger fraction of
blue galaxies when compared to local ones (Butcher & Oemler 1984) and also
that they contain a larger fraction of spirals and a reduced number of S0s
(Dressler et al. 1997). Both these observed facts suggest that
$f_{\rm spiral}$ also increases with $z$. Hence, it seems sensible to deduce
that the order of magnitude estimate based on the locally observed
$f_{\rm HI}^{\rm est}$ is likely to be an under-estimate by a factor of a few.

Figure 3. The same quantity as plotted in fig. 2 but this time against
$f_{\rm obs}$ for $M_{\rm vir} = 3\times10^{12} M_\odot$ (long-dashed line),
$10^{13} M_\odot$ (dotted line), $3\times10^{13} M_\odot$ (short-dashed line),
$10^{14} M_\odot$ (solid line) and $3\times10^{14} M_\odot$ (dot-dashed line).

Various estimates of $\Omega_{\rm HI}$ appear in the literature and these
can be used to gain some further insight into the issues under consideration
here. By making observations of damped Lyman-$\alpha$ systems in an average
redshift range of $\langle z\rangle \approx 2.4$, the CORALS project (Ellison
et al. 2001) estimate that $\Omega_{\rm HI} \sim 2.6\times10^{-3}$. Using the
fiducial value of $\Omega_m$, one can deduce a value of
$f_{\rm HI} \approx 8\times10^{-3}$ for the field at this epoch, corresponding
to a density of $\rho_{\rm HI} \sim 3.6\times10^{8} M_\odot\,{\rm Mpc}^{-3}$.
Using the recently published data from the HIPASS survey it has been deduced
(Zwaan et al. 2003) that for z < 0.04 the value of $\Omega_{\rm HI}$ is much
lower, $\Omega_{\rm HI} \sim 3.8\times10^{-4}$, from which one can deduce that
$f_{\rm HI} \sim 1.2\times10^{-3}$ and
$\rho_{\rm HI} \sim 5.3\times10^{7} M_\odot\,{\rm Mpc}^{-3}$. Remarkably, this
is approximately a factor 6-7 smaller than the estimated value at z = 2.4 and
is clearly compatible with our earlier remarks concerning the global star
formation rate. Moreover, the estimated value at z ~ 0 is very close to the
value we have discussed in the context of clusters. This would tend to suggest
that the value of $f_{\rm HI}$ is only marginally lower in clusters (say
20-30%) than in the field. Since the clusters are a substantial (~ 100)
overdensity in the dark matter they should be clearly observable in HI with an
appropriately sized telescope.
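
The conversions used in this paragraph follow from
$f_{\rm HI} = \Omega_{\rm HI}/\Omega_m$ and
$\rho_{\rm HI} = \Omega_{\rm HI}\,\rho_{\rm crit}$. The sketch below
evaluates both for the two quoted values of $\Omega_{\rm HI}$, assuming the
standard critical density $\rho_{\rm crit} \approx 2.775\times10^{11}
h^2\,M_\odot\,{\rm Mpc}^{-3}$ and the cosmological parameters adopted in this
paper; the function names are illustrative.

```python
RHO_CRIT_H2 = 2.775e11   # critical density in Msun / Mpc^3 for h = 1
H_LITTLE = 0.72          # H0 / (100 km/s/Mpc), as adopted in the text
OMEGA_M = 0.3


def f_hi_field(omega_hi):
    """HI mass as a fraction of the total matter density."""
    return omega_hi / OMEGA_M


def rho_hi_msun_mpc3(omega_hi):
    """Comoving HI mass density."""
    return omega_hi * RHO_CRIT_H2 * H_LITTLE ** 2


for label, omega_hi in (("z ~ 2.4 (DLAs)", 2.6e-3), ("z < 0.04 (HIPASS)", 3.8e-4)):
    print(label, f"f_HI = {f_hi_field(omega_hi):.1e}",
          f"rho_HI = {rho_hi_msun_mpc3(omega_hi):.1e} Msun/Mpc^3")

ratio = f_hi_field(2.6e-3) / f_hi_field(3.8e-4)
print(f"high-z / low-z ratio = {ratio:.1f}")
```

The results agree with the quoted figures to within the rounding used in the
text, and the ratio of the two epochs is just under 7.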



3.2 Number of virialized dark matter halos 

Since we are only trying to compute a lower bound on the number of objects
found in a given survey, it is reasonable to assume a value of
$f_{\rm HI} \sim 10^{-3}$, constant over the range of masses we are
considering and independent of redshift. If
$M_{\rm HI}^{\rm lim} \approx 10^{11} M_\odot$, as discussed earlier for the
various planned surveys, then $M_{\rm lim} \approx 10^{14} M_\odot$ for the
dark matter, and hence computing the number of objects greater than this limit
will give an estimate of the number of objects likely to be found in a survey.

Now, let us compute the number of objects with mass greater than some
limiting mass $M_{\rm lim}$ between $z - \Delta z/2$ and $z + \Delta z/2$ in a
solid angle $\Delta\Omega$,

$$N(M > M_{\rm lim}) = \Delta z\,\Delta\Omega\,\frac{dV}{dz\,d\Omega}\int_{M_{\rm lim}}^{\infty}\frac{dn}{dM}\,dM\,, \qquad (5)$$

where

$$\Delta z = (1+z)\,\Delta f_{\rm inst}/f_{\rm obs} \qquad (6)$$

is assumed to be small, $dV/(dz\,d\Omega)$ is the comoving volume element and
$dn/dM$ is the comoving number density of objects.
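
To give a feel for the volume factor in this expression, the sketch below
evaluates the comoving volume per deg$^2$ for a given fractional bandwidth in
the flat cosmology adopted in this paper, using the standard result
$dV/(dz\,d\Omega) = (c/H_0)\,d_C^2/E(z)$ for a flat universe, where $d_C$ is
the comoving distance. The function names are illustrative.

```python
import math

C_KM_S = 299_792.458
H0 = 72.0
OMEGA_M, OMEGA_L = 0.3, 0.7


def e_of_z(z):
    """Dimensionless Hubble rate for a flat matter + Lambda model."""
    return math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def comoving_distance_mpc(z, steps=2000):
    """d_C = c/H0 * int_0^z dz'/E(z'), by the trapezoid rule."""
    dz = z / steps
    total = 0.5 * (1.0 / e_of_z(0.0) + 1.0 / e_of_z(z))
    for i in range(1, steps):
        total += 1.0 / e_of_z(i * dz)
    return C_KM_S / H0 * total * dz


def dv_dz_domega(z):
    """Comoving volume element in Mpc^3 per unit redshift per steradian."""
    return C_KM_S / H0 * comoving_distance_mpc(z) ** 2 / e_of_z(z)


def volume_per_deg2(z, frac_bandwidth):
    """Comoving volume (Mpc^3) in 1 deg^2 for delta_z = (1+z) * df/f_obs."""
    delta_z = (1.0 + z) * frac_bandwidth
    sr_per_deg2 = (math.pi / 180.0) ** 2
    return dv_dz_domega(z) * delta_z * sr_per_deg2


vol = volume_per_deg2(0.42, 0.01)
print(f"{vol:.2e} Mpc^3 per deg^2 per 1% bandwidth at z = 0.42")
```

Multiplying this volume by the integrated mass function above $M_{\rm lim}$
gives the number of objects per deg$^2$ per 1% bandwidth.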

We will use an expression for the comoving number density which is derived
from numerical simulations carried out by the VIRGO consortium (Evrard et al.
2002),

$$\frac{dn}{dM}(z,M) = -0.22\,\frac{\rho_m(t_0)}{M\sigma_M}\frac{d\sigma_M}{dM}\exp\left\{-|A(z,M)|^{3.86}\right\}, \qquad (7)$$

where $A(z,M) = 0.73 - \log[D(z)\sigma_M]$, $\rho_m(t_0)$ is the matter
density at the present day, $\sigma_M$ is the rms overdensity in a region
containing mass $M$ and $D(z)$ is the normalized growth factor. Here, the mass
is defined to be $M_{200}$, that inside a region with an approximately
spherical overdensity $\Delta_c = 200$. This can be related to the virial
mass, $M_{\rm vir}$, if we assume an NFW function for the cluster profile
(Navarro et al. 1997). In all the subsequent discussion we have made this
correction using a concentration parameter of $c = 5$, in which case the
correction from $M_{200}$ to $M_{\rm vir}$ is typically very small.
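
The structure of this fitting formula can be sketched in a few lines. The
code below implements the multiplicity factor
$0.22\exp\{-|0.73 - \ln[D\sigma_M]|^{3.86}\}$ and the corresponding $dn/dM$,
taking the logarithm to be natural (an assumption made here) and leaving
$\sigma_M$ and its derivative as inputs, since computing them requires a
power spectrum that is outside the scope of this illustration.

```python
import math


def multiplicity(sigma_m, growth=1.0):
    """Fitted multiplicity factor 0.22 * exp(-|0.73 - ln(D sigma_M)|^3.86)."""
    a = 0.73 - math.log(growth * sigma_m)
    return 0.22 * math.exp(-abs(a) ** 3.86)


def dn_dm(mass, sigma_m, dsigma_dm, rho_m0, growth=1.0):
    """Comoving number density per unit mass.

    mass      : halo mass (Msun)
    sigma_m   : rms overdensity at mass M, linearly extrapolated to z = 0
    dsigma_dm : derivative of sigma_M with respect to M (negative)
    rho_m0    : present-day matter density (Msun / Mpc^3)
    growth    : normalized growth factor D(z)
    """
    return -rho_m0 / (mass * sigma_m) * dsigma_dm * multiplicity(sigma_m, growth)


# The factor peaks at 0.22 where ln(D sigma_M) = 0.73 and falls off on either side.
print(multiplicity(math.exp(0.73)), multiplicity(1.0), multiplicity(0.5))
```

Because $d\sigma_M/dM$ is negative, the leading minus sign makes the number
density positive; lowering $D(z)$ at fixed $\sigma_M$ suppresses the abundance
of rare, massive objects, which is the evolution of structure referred to in
the text.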

In fig. 2 we plot $N(M > M_{\rm lim})$ per deg$^2$ per 1% bandwidth against
$M_{\rm vir}$ for five different redshifts (z = 2.55, 1.37, 0.78, 0.42, 0.18)
corresponding to frequencies $f_{\rm obs}$ = 400 MHz, 600 MHz, 800 MHz,
1000 MHz and 1200 MHz respectively. We see that there are ~ 0.5 dark matter
halos with $M_{\rm vir} \approx 10^{14} M_\odot$ per deg$^2$ per 1% bandwidth
for 600 MHz < $f_{\rm obs}$ < 1200 MHz. There are very few objects with
$M_{\rm vir} > 2\times10^{13} M_\odot$ for $f_{\rm obs} = 400$ MHz. Larger
objects with $M_{\rm vir} \sim 3\times10^{14} M_\odot$ are much rarer for
$f_{\rm obs} = 600$ MHz, but for higher frequencies there would still be many
objects above this limiting mass. Using the fiducial value of
$M_{\rm lim} = 10^{14} M_\odot$ at $f_{\rm obs} = 1000$ MHz, by referring back
to table 2 we see that it would be possible to cover 10000 deg$^2$ in
$\approx 400$ days using setup II. Such a survey would find $\approx 15000$
objects above this mass limit since there are $\approx 0.3$ objects per
deg$^2$ per 1% bandwidth. We note that a similar survey would take only 90
days with setup III and there would be twice as many objects found since it
has a larger bandwidth. For the same integration time one could cover only
about 1000 deg$^2$ at $f_{\rm obs} = 800$ MHz and would find $\approx 1000$
objects with setup II. At $f_{\rm obs} = 1200$ MHz one would be able to cover
more area, but there are fewer objects since the comoving volume is
decreasing, and things are likely to be somewhat more complicated since
$10^{14} M_\odot$ objects would be resolved at this frequency. Much less area
could be covered to the same depth at $f_{\rm obs} = 600$ MHz and, as we will
discuss, it might be that such objects are difficult to detect against the
background since the beamsize increases with decreasing frequency. Clearly,
there is no point in searching for virialized objects at
$f_{\rm obs} = 400$ MHz.

Figure 4. The power law $n$ such that
$(N/1\,{\rm deg}^2) \propto M_{\rm vir}^{-n}$. Under the assumption that
$f_{\rm HI}$ is independent of $M_{\rm vir}$ the optimal strategy is when
$n = 5/3$. The lines are labelled as in fig. 2.

We have also computed the same quantity as a function of $f_{\rm obs}$ in
fig. 3 for different values of $M_{\rm lim}$. We see that there are very few
virialized objects accessible to observations with $f_{\rm obs} < 600$ MHz,
again suggesting that surveys which are intending to search for such objects
are unlikely to find anything significant for z > 1.4. However, we also see
that for the fiducial value of $M_{\rm lim} \sim 10^{14} M_\odot$ which we
have been using, there is a wide range of frequencies for which one will find
more than one object per 10 deg$^2$ in a 1% bandwidth. Many smaller objects
($M_{\rm vir} \approx 3\times10^{12}$-$10^{13} M_\odot$) are accessible at
lower frequencies due to the evolution of structure.



3.3 Optimal design and survey strategy 

One can attempt to gain some understanding of the optimal design of an
instrument and survey strategy by substituting into the expression for
$N(M > M_{\rm lim})$ the angular coverage $\Delta\Omega$ from the survey noise
relation and $\Delta z$ from the bandwidth relation, and using
$S_\sigma\Delta v \propto M_{\rm HI}/d_L^2$. One finds that
$N \propto t\,\mathcal{M}\,(M_{\rm HI}^2/\Delta v)$, where

$$\mathcal{M} \propto \frac{\eta^2 A^2\,\Omega_{\rm inst}\,\Delta f_{\rm inst}}{T_{\rm sys}^2}\,, \qquad (8)$$

can be thought of as the instrumental figure of merit, which clearly has
sensible dependences on the relevant parameters. This effectively quantifies
how many objects one would find in a survey, ignoring any potential
systematics. Alternatively, the integration time required to find a fixed
number of objects is $\propto \mathcal{M}^{-1}$. It is clear that similar
arguments can be made for a galaxy redshift survey and this would, therefore,
also apply to the SKA and other similar instruments. This formula should also
yield important information when applied to surveys for other kinds of
objects.
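
The scaling of this figure of merit,
$\mathcal{M} \propto \eta^2A^2\,\Omega_{\rm inst}\Delta f_{\rm inst}/T_{\rm sys}^2$,
can be explored with a short sketch. Only ratios of $\mathcal{M}$ between
instruments are meaningful, so the code below compares an instrument with a
reference. Both sets of parameters in the example are hypothetical and are
not the values used for the tables in this paper.

```python
def figure_of_merit(eta, area_m2, fov_deg2, bandwidth_mhz, t_sys_k):
    """Survey-speed figure of merit, up to an overall constant."""
    return (eta * area_m2) ** 2 * fov_deg2 * bandwidth_mhz / t_sys_k ** 2


def relative_merit(instrument, reference):
    """Ratio of the figures of merit of two instruments (dicts of parameters)."""
    return figure_of_merit(**instrument) / figure_of_merit(**reference)


# Hypothetical reference instrument and a hypothetical array with 2.5 times
# the effective area, 20 times the field of view and 4 times the bandwidth.
reference = dict(eta=0.6, area_m2=4000.0, fov_deg2=0.5, bandwidth_mhz=10.0, t_sys_k=25.0)
array = dict(eta=0.8, area_m2=7500.0, fov_deg2=10.0, bandwidth_mhz=40.0, t_sys_k=25.0)

gain = relative_merit(array, reference)
print(f"relative figure of merit = {gain:.0f}")
```

Since the integration time needed to find a fixed number of objects scales as
$\mathcal{M}^{-1}$, the same ratio also gives the factor by which a survey is
shortened.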

The value of $\mathcal{M} \sim 27$ for the ATA configuration discussed
earlier, and its value is presented in table 1 for the phased arrays. This
figure of merit has been normalized to the theoretical sensitivity of the
Parkes multi-beam (PM) receiver (Barnes et al. 2001), although we note that
this instrument can only observe in a narrow frequency range around 21cm. We
have also computed this quantity for the Lovell Telescope multi-beam receiver
and the Green Bank Telescope (single beam); they are
$\mathcal{M} \sim 0.34$ and 0.21 respectively. Note that we have used the
frequency independent $T_{\rm sys}^{\rm rec}$ to compute this quantity.

We see that setup I is about a factor of 5 more powerful than the PM, while
both the presented ATA configuration and setup II improve on this
substantially. Setup III is more than 500 times more powerful than the PM. For
a phased array configuration $\Omega_{\rm inst} \propto n_b d^{-2}$ and hence
$\mathcal{M} \propto \eta_F = A/d^2$, which represents the filling factor of
the array. It is clear that, at least for unresolved detections and ignoring
the issue of confusion, which we will discuss below, a totally filled array
($\eta_F = 1$) will perform the best in terms of having the largest number of
objects above the detection threshold. This is because an array with
$\eta_F \ll 1$ has a much lower sensitivity to surface brightness temperature
than one with $\eta_F \sim 1$. However, a filled array would be likely to have
poor resolution, particularly at low frequencies, and would, therefore, make
it difficult to make any sub-selection within the sample to cut down on
systematics. It is possible that low resolution arrays could also suffer from
confusion related issues (see below).

Assuming a virialized halo, $\Delta v \propto \sigma \propto M_{\rm vir}^{1/3}$ ($\sigma$ is the 
velocity dispersion of the object), and if $f_{\rm HI}$ is independent of 
$M_{\rm vir}$, then the multiplicative factor $M_{\rm HI}/\Delta v \propto M_{\rm vir}^{2/3}$. Since 
the integral giving the number of objects, $N$, is a decaying function 
of $M_{\rm vir}$ with negative power law $n = n(M_{\rm vir})$, the optimal observing 
strategy would be to set the noise so that $M_{\rm lim}$ is that for which 
$n = 5/3$, that is $M_{\rm lim} \approx 2\times10^{14}M_\odot$, $10^{14}M_\odot$, $4\times10^{13}M_\odot$, 
$1.5\times10^{13}M_\odot$ and $10^{12}M_\odot$ for $f_{\rm obs} = 1200$ MHz, 1000 MHz, 
800 MHz, 600 MHz and 400 MHz respectively, this being a 
simple consequence of the evolution of structure. We have 
presented this power law $n = -{\rm d}(\log N)/{\rm d}(\log M)$ as a function 
of $M_{\rm vir}$ in fig. 01. We see that under these assumptions 
$M_{\rm lim} \approx 10^{14}M_\odot$ should be close to the optimal mass limit 
for $f_{\rm obs} = 1000$ MHz and that lower limits, requiring deeper 
surveys and hence less angular coverage for a fixed integration 
time, are needed to be optimal at lower frequencies. 
We should caution that this is heavily predicated on the 
assumption that $f_{\rm HI}$ is independent of $M$, which we have 
already argued is unlikely to be the case, and this would have 
to be taken into account before relying on such a calculation 
to set the depth of an actual observational strategy. If $f_{\rm HI}$ 
decreases with $M$ then one should perform a deeper survey 
than one would if it were independent of $M$. 
Suffice to say, if one has some idea as to the dependence of 
$f_{\rm HI}$ on $M$, the method would remain the same, but with a 
different value of $n$. 
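
The criterion $n = 5/3$ can be illustrated with a short numerical sketch. It assumes, as our reading of the scaling above, that the area covered in a fixed total time grows as $M_{\rm lim}^{5/3}$, so that the yield $\propto M_{\rm lim}^{5/3} N(>M_{\rm lim})$ peaks where the local slope equals $5/3$. The cumulative counts used here (`toy_counts`, with cut-off `M_STAR`) are a hypothetical toy model and not the mass function used in this paper; for this toy model $n = 1 + M/M_*$, so the optimum should fall at $M = 2M_*/3$.

```python
import math

M_STAR = 1.0e14  # hypothetical cut-off mass of the toy model, in solar masses

def toy_counts(mass):
    """Toy N(>M): an M^-1 power law with an exponential cut-off (illustrative only)."""
    return (mass / M_STAR) ** -1.0 * math.exp(-mass / M_STAR)

def slope(counts, mass, eps=1.0e-4):
    """n = -d(log N)/d(log M), from a central difference in log M."""
    upper = math.log(counts(mass * (1.0 + eps)))
    lower = math.log(counts(mass * (1.0 - eps)))
    return -(upper - lower) / (math.log1p(eps) - math.log1p(-eps))

def optimal_limit(counts, target, m_low, m_high, tol=1.0e-8):
    """Bisect in log M for slope == target; assumes the slope rises with mass."""
    lo, hi = math.log(m_low), math.log(m_high)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if slope(counts, math.exp(mid)) < target:
            lo = mid
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))

def relative_yield(counts, mass):
    """Objects found in a fixed total time, assuming area scales as M^(5/3)."""
    return mass ** (5.0 / 3.0) * counts(mass)

m_opt = optimal_limit(toy_counts, 5.0 / 3.0, 1.0e12, 1.0e15)
print(f"optimal limit = {m_opt:.3e} Msun")
```

With tabulated counts from a realistic mass function in place of `toy_counts`, the same bisection would give the optimal limit at each observing frequency; a different dependence of $f_{\rm HI}$ on $M$ would only change the target slope.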

Figure 5. An attempt to estimate the confusion noise as a function 
of the limiting HI mass of a survey, $M_{\rm HI}^{\rm lim}$. The solid line is a 
representation of the thermal noise: when the other curves are 
below it the confusion noise is less than the contribution due to 
thermal noise. The dotted line is for $z = 0.42$, which corresponds 
to $f_{\rm obs} = 1000$ MHz, with $f_{\rm HI} = 10^{-3}$ for all $M$. The equivalent 
curve for $z = 1.37$ ($f_{\rm obs} = 600$ MHz) is the short dashed line. The 
dashed-dotted line ($z = 0.42$) and the long dashed line ($z = 1.37$) 
are those given by using $f_{\rm HI}$ from (10). The values of $\theta_{\rm FWHM}$ are 
those used for setup II. 



3.4 Confusion noise 

Observations using spectral lines are not typically affected 
by issues of confusion as can often be the case for continuum 
sources. However, in the current situation we are dealing 
with very large amounts of HI and very large beams. 
One has, therefore, to be careful to avoid making the beam 
size so large that a typical beam contains an amount of HI 
comparable to the detection threshold. This becomes more 
and more important as one makes deeper and deeper surveys, 
particularly at high redshift. As we shall see this is 
very sensitive to the small scale distribution of HI, but in 
order to make a simple estimate let us note that the comoving 
volume enclosed by a beam of 10' and a velocity width 
of $800\,{\rm km\,sec}^{-1}$ at $z \sim 1$ is $\sim 1000\,{\rm Mpc}^3$. Since the comoving 
matter density is $\approx 4\times10^{10}M_\odot\,{\rm Mpc}^{-3}$, this means that 
a typical beam volume contains about $4\times10^{13}M_\odot$ of dark 
matter and, using our estimate for $f_{\rm HI}$, about $4\times10^{10}M_\odot$ of 
HI. Clearly this qualifies our earlier assertion that $\eta_F \sim 1$ 
would be the best situation for unresolved detections. Making 
$\theta_{\rm FWHM}$ sufficiently small as to avoid this issue is clearly 
another important design criterion. 
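
The order of magnitude of this estimate can be checked with a few lines of code. The sketch below assumes a flat $\Lambda$CDM cosmology with $H_0 = 70\,{\rm km\,sec}^{-1}\,{\rm Mpc}^{-1}$ and $\Omega_{\rm m} = 0.3$, treats the beam as a circle of diameter 10', and uses $f_{\rm HI} = 10^{-3}$; these choices and all function names are assumptions made for illustration.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def hubble(z, h0=70.0, omega_m=0.3):
    """H(z) in km/s/Mpc for flat LambdaCDM (assumed parameters)."""
    return h0 * math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))

def comoving_distance(z, steps=2000):
    """Line-of-sight comoving distance in Mpc, by the trapezoidal rule."""
    dz = z / steps
    total = 0.5 * (1.0 / hubble(0.0) + 1.0 / hubble(z))
    for i in range(1, steps):
        total += 1.0 / hubble(i * dz)
    return C_KM_S * total * dz

def beam_volume(z, beam_arcmin, delta_v_km_s):
    """Comoving volume (Mpc^3) of a circular beam and a velocity slice."""
    theta = math.radians(beam_arcmin / 60.0)
    diameter = comoving_distance(z) * theta          # transverse comoving size
    area = math.pi * (diameter / 2.0) ** 2
    depth = (1.0 + z) * delta_v_km_s / hubble(z)     # c*dz/H with dz = (1+z) dv/c
    return area * depth

volume = beam_volume(1.0, 10.0, 800.0)
rho_matter = 4.0e10          # Msun per Mpc^3, the value quoted in the text
f_hi = 1.0e-3                # assumed HI fraction
m_dark = rho_matter * volume
m_hi = f_hi * m_dark
print(f"V = {volume:.0f} Mpc^3, M_dark = {m_dark:.1e} Msun, M_HI = {m_hi:.1e} Msun")
```

Under these assumptions the volume comes out slightly below $1000\,{\rm Mpc}^3$; a square rather than circular beam would raise it by a factor $4/\pi$.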

An interferometer or phased array would not, in fact, 
be sensitive to this smooth mass distribution, but rather to 
the fluctuations in it, which are typically smaller. Therefore, 
the discussion above yields an over-estimate of the possible 
effects of confusion. The clusters have overdensities of $\sim 100$ 
in the dark matter and we have already pointed out that, 
so far as current observations can tell, the value of $f_{\rm HI}$ in 
the field is only marginally larger than that in a cluster, for 
example, in A3128. Even if the overdensity in HI were to 
be diluted to only $\sim 20$, one should not have any problem 
in picking them out from the background with a telescope 
whose beam is approximately the same size as the object. 
Velocity structure can only help in this respect. The problem 
is that for a given telescope the resolution degrades very 
rapidly as redshift increases. Hence, if a telescope were to 
be ideally suited to detection of clusters at $z \sim 0.5$ then by 
$z \sim 1$ this would lead to a dilution of the overdensity by 
a factor of $\sim 4$. 

One can make an estimate of the rms fluctuation in the 
mass in a volume defined by the beam area and the velocity 
width $\Delta v = 800\,{\rm km\,sec}^{-1}$ by computing 

$$\langle M_{\rm HI}^2\rangle = \Delta z\,\Delta\Omega\,\frac{{\rm d}V}{{\rm d}z\,{\rm d}\Omega}\int \left[f_{\rm HI}(M,z)\right]^2 M^2\,\frac{{\rm d}n}{{\rm d}M}\,{\rm d}M, \qquad (9)$$

where $\Delta z = (1+z)\,\delta v = (1+z)\,f_{\rm obj}/f_{\rm obs}$ is the velocity width 
of the object and $\Delta\Omega \propto \theta_{\rm FWHM}^2$ is the beam area. We see 
that in order to estimate the confusion noise $\langle M_{\rm HI}^2\rangle^{1/2}$ one 
needs to know $f_{\rm HI}(M,z)$ for all $M$ at the particular redshift 
in question, that is, we need to extrapolate our argument for 
$f_{\rm HI}$ down to the galactic scale and below, which is beyond 
the scope of this paper. We present the results for an experimental 
configuration similar to setup II, that is, $z \approx 0.42$ 
($\theta_{\rm FWHM} \approx 5'$) and $z \approx 1.37$ ($\theta_{\rm FWHM} \approx 8.4'$), using the fixed 
value of $f_{\rm HI} \approx 10^{-3}$ in fig. 5, along with some attempt to 
interpolate between $f_{\rm HI} \approx 10^{-2}$ expected for lower mass 
objects and that which we have argued applies to larger objects, 
$f_{\rm HI} \approx 10^{-3}$. We do this in an ad hoc way using the 
function 

$$f_{\rm HI}(M,z) = \frac{10^{-3}\,(M/M_\odot) + 10^{8}}{(M/M_\odot) + 10^{10}}, \qquad (10)$$

to gain some insight into what the possible effects could be. 
We stress that we are not claiming that this expression has 
any physical origin, apart from the fact that it models the 
correct kind of behaviour. For $M \gg 10^{10}M_\odot$ this function 
yields $f_{\rm HI} \approx 10^{-3}$ and for $M \ll 10^{10}M_\odot$ it yields $f_{\rm HI} = 
10^{-2}$. We note that in such a model the overall HI content 
of the universe is larger than in one with the fixed value of 
$f_{\rm HI} = 10^{-3}$. 
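
As a quick check of the limiting behaviour of the interpolating function (10), one can evaluate it over a range of masses. In the sketch below the function name `f_hi_interp` and its parameters are hypothetical, and the constants are an assumption: they are chosen so that the two limits quoted in the text ($10^{-2}$ at low mass, $10^{-3}$ at high mass, with the transition at $10^{10}M_\odot$) are reproduced.

```python
def f_hi_interp(mass_msun, f_low=1.0e-2, f_high=1.0e-3, m_trans=1.0e10):
    """Ad hoc interpolation of the HI fraction between two mass regimes.

    f_low applies for mass << m_trans and f_high for mass >> m_trans;
    the default constants are assumed from the limits stated in the text.
    """
    return (f_high * mass_msun + f_low * m_trans) / (mass_msun + m_trans)

for mass in (1.0e6, 1.0e8, 1.0e10, 1.0e12, 1.0e14):
    print(f"M = {mass:.0e} Msun  ->  f_HI = {f_hi_interp(mass):.2e}")
```

At the transition mass the function takes the mean of the two limits, $5.5\times10^{-3}$, and it is within about 10 per cent of $10^{-3}$ by $M = 10^{12}M_\odot$.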

We see that, even when using (10), observations with setup 
II at $f_{\rm obs} = 1000$ MHz would have a confusion noise 
which is much lower than the thermal noise mass limit if 
$M_{\rm HI}^{\rm lim} > 2\times10^{10}M_\odot$. Using the constant value of $f_{\rm HI} = 10^{-3}$ 
the restriction is even weaker. Therefore, we can conclude 
that the confusion noise would be very much lower than the 
thermal noise value for the fiducial value of $M_{\rm HI}^{\rm lim} = 10^{11}M_\odot$ 
used throughout this paper. As one might expect things are 
more restrictive at $f_{\rm obs} = 600$ MHz. For the fixed value of 
$f_{\rm HI}$ one would be restricted to $M_{\rm HI}^{\rm lim} > 3\times10^{10}M_\odot$ and if (10) 
were to be true the confusion noise would be comparable to 
the thermal noise at $M_{\rm HI}^{\rm lim} = 10^{11}M_\odot$. One would be more 
likely to be affected by confusion noise for setup I due to its 
large beam and less for setup III and the ATA since they 
have smaller beams. 



4 DETECTING PROTO-CLUSTERS 

4.1 Number of dark matter halos at turnaround 

The majority of our discussion to date has focused on the 
possibility of detecting virialized objects. However, we have 
noted that such objects are likely to have had their HI 
content depleted relative to the field by at least 20-30% 
during the process of virialization. It should also be possible 
to detect objects which are just beginning to collapse, 
at turnaround, when they are just decoupling from the 
Hubble flow. This could be more efficient at high redshift, 
where there are likely to be very few virialized objects with 
$M_{\rm vir} > 10^{14}M_\odot$ and virialized objects could become confused 
if the beam is too large. At turnaround $\Delta_c \approx 5$ and, by 
virtue of the fact that virialization has not yet taken place, 
it should be possible to use, with some confidence, the value 
of $f_{\rm HI}$ for the field at that time. Assuming a linear rate of 
star formation between $z = 0$ and $z = 2.4$, one can deduce 
that 

$$f_{\rm HI}^{\rm field}(z) \approx (1.2 + 2.8z)\times10^{-3}, \qquad (11)$$

for the field. 

Moreover it is possible that the velocity dispersion of 
such objects is much less than the fiducial value of $\Delta v = 
800\,{\rm km\,sec}^{-1}$ used in the earlier parts of the paper for 
virialized objects. If this is so, the signal to noise for a fixed 
integration time on a source would increase. This is because 
the signal is $\propto (\Delta v)^{-1}$ for a fixed HI mass and the noise 
is $\propto (\Delta v)^{-1/2}$, so that the signal to noise is $\propto (\Delta v)^{-1/2}$. 
The typical value of $\Delta v$ for such an object 
is unknown. Formally, the point of turnaround is defined to 
be that when the average velocity is zero, but the velocity 
dispersion need not be so. It will be governed by that of 
the virialized objects within the region which are in the 
process of merging to build up the cluster. In the subsequent 
discussion we will allow $\Delta v$ to be a modelling parameter, 
but it is worth discussing various possible values that it 
might take. If the object at turnaround just comprises two 
large objects which are collapsing together then the value of 
$\Delta v$ will be close to the fiducial value of $800\,{\rm km\,sec}^{-1}$. It is 
possible that such an object contains $10^{14}M_\odot$ of dark matter 
and is comprised of $\sim 100$ smaller galactic-scale objects 
in the process of infall with masses of $\sim 10^{12}M_\odot$. In this 
case $\Delta v \sim 100$-$200\,{\rm km\,sec}^{-1}$. Or it could be that there are 
many more smaller objects with, say, $\Delta v \sim 20$-$50\,{\rm km\,sec}^{-1}$. 
These possibilities are worth remembering in the subsequent 
discussion. 

We should note that the collapsing objects at $z \approx 
1$-$2$ are the proto-clusters which would have virialized 
by $z \sim 0.5$-$1$. If $R_{\rm ta}$ is the radius of the object at 
turnaround then the radius of that object once it has 
virialized is given by $R_{\rm vir} = R_{\rm ta}/2$, assuming a completely 
matter-dominated universe (Partridge & Peebles 
1967; Gunn & Gott 1972). Corrections can be made to include 
a cosmological constant (Lahav et al. 1991) and also 
dark energy (Wang & Steinhardt 1998; Battye & Weller 
2003). Similar relations can be derived relating the time 
at turnaround $t_{\rm ta}$ and that at virialization, $t_{\rm vir} = 2t_{\rm ta}$, 
and the corresponding redshifts, $1 + z_{\rm ta} = 2^{2/3}(1 + z_{\rm vir})$. 
Liddle & Lyth (2000) give a detailed exposition of this 
model. 
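
The relation between the turnaround and virialization redshifts is simple to evaluate. The sketch below uses only the matter-dominated relation $1 + z_{\rm ta} = 2^{2/3}(1 + z_{\rm vir})$ quoted above, without the corrections for a cosmological constant or dark energy; the function name is an assumption made for illustration.

```python
def z_vir_from_z_ta(z_ta):
    """Virialization redshift of an object that turns around at z_ta,
    using 1 + z_ta = 2**(2/3) * (1 + z_vir) (matter-dominated universe)."""
    return (1.0 + z_ta) / 2.0 ** (2.0 / 3.0) - 1.0

for z_ta in (1.0, 1.5, 2.0):
    print(f"z_ta = {z_ta:.1f}  ->  z_vir = {z_vir_from_z_ta(z_ta):.2f}")
```

In this approximation objects turning around at $z_{\rm ta} = 1$ and $2$ virialize at $z_{\rm vir} \approx 0.26$ and $0.89$ respectively.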

The formula Q applies to objects with an overdensity, 
$\Delta_c = 200$. If one assumes an NFW profile function for 
the objects with $c = 5$, one can show that $M_{200} \approx M_5/2$; 
this relation is approximately true for a wide range of 
concentration parameters. Taking the fiducial value of $M_{\rm HI}^{\rm lim} = 
10^{11}M_\odot$ and $f_{\rm HI} \approx 5\times10^{-3}$ for $f_{\rm obs} = 600$ MHz, we see 
that $M_5^{\rm lim} \approx 2\times10^{13}M_\odot$ and the corresponding value of 
$M_{200} \approx 10^{13}M_\odot$ which can be used in conjunction with 
figs. |5| |21 and 2] to deduce the number of collapsing objects 

Figure 6. The angular diameter size (twice the radius) of objects 
which are at turnaround with $M_5 = 10^{14}M_\odot$, $10^{13}M_\odot$ and 
$10^{12}M_\odot$ (solid lines, top to bottom respectively). Included also 
is the estimated beam size for the three different phased array 
configurations (dashed lines, I top, II middle, III bottom). Note 
that the angular size of the collapsing objects is much larger 
than that of objects of the same virial mass at the same redshift. 



that would be found in a given survey, if one ignores the 
smaller correction from $M_{200}$ to $M_{\rm vir}$. We see that there are 
$\approx 20$ objects of this size per deg$^2$ per 1% bandwidth. For 
$f_{\rm obs} = 400$ MHz the value of $f_{\rm HI}$ is higher and hence there 
are about the same number of objects, the increase in $f_{\rm HI}$ 
offsetting the smaller number of objects for a given $M_{\rm vir}$. With 
this large number of objects and the large beams likely at 
this observing frequency, one might think that one would be 
close to confusion limited ($\sim 1$ object per beam area), but 
the extra information provided by the velocity information 
should help avoid this possibility. 
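
The chain of conversions used here, from observing frequency to redshift, field HI fraction and limiting masses, can be written out as a short sketch. It uses (11) for the field value of $f_{\rm HI}$ and the approximate relation $M_{200} \approx M_5/2$; the function names are assumptions made for illustration, and applying (11) at 400 MHz ($z \approx 2.55$) is an extrapolation beyond $z = 2.4$.

```python
F_REST_MHZ = 1420.4  # rest frequency of the 21cm line in MHz

def redshift(f_obs_mhz):
    """Redshift at which the 21cm line is observed at f_obs_mhz."""
    return F_REST_MHZ / f_obs_mhz - 1.0

def f_hi_field(z):
    """Field HI fraction from the linear fit (11); extrapolated above z = 2.4."""
    return (1.2 + 2.8 * z) * 1.0e-3

def mass_limits(m_hi_lim, f_obs_mhz):
    """Return (z, M_5 limit, M_200 limit) in solar masses for an HI mass limit."""
    z = redshift(f_obs_mhz)
    m5 = m_hi_lim / f_hi_field(z)
    m200 = 0.5 * m5  # approximate NFW relation M_200 ~ M_5 / 2
    return z, m5, m200

for f_obs in (600.0, 400.0):
    z, m5, m200 = mass_limits(1.0e11, f_obs)
    print(f"f_obs = {f_obs:.0f} MHz: z = {z:.2f}, "
          f"M5 = {m5:.1e} Msun, M200 = {m200:.1e} Msun")
```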

Given the larger value of $f_{\rm HI}$ likely for objects which 
have not virialized, it might be sensible to also consider the 
possibility of $M_{\rm HI}^{\rm lim} = 10^{12}M_\odot$, for which the corresponding 
values of $M_5^{\rm lim}$ and $M_{200}$ are 10 times larger. We see that 
there are still numerous objects ($\sim 0.1$ per deg$^2$ per 1% 
bandwidth) of this size for $f_{\rm obs} = 600$ MHz, although there 
are markedly fewer for lower values of $f_{\rm obs}$ even taking into 
account the larger value of $f_{\rm HI}$. Achieving this larger limiting 
mass at 600 MHz should be possible in a fraction of the time 
required for $10^{11}M_\odot$. 

We should note that at any given redshift the objects 
at turnaround will be larger than those which are virialized, 
but have the same overall mass. Fig. 6 shows the angular 
diameter size of objects with a variety of masses at turnaround. 
We see that these objects are larger than virialized objects 
of the same mass at the same redshift. At the high redshifts 
we are considering here, the objects still have an angular 
diameter less than 5' and hence they are likely to be unresolved 
by the instruments under discussion. At lower redshifts such 
objects would be resolved and it should be more efficient to 
detect virialized objects. 

4.2 Detection related issues 

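We can adapt the earlier sensitivity calculation to an 
arbitrary value of $\Delta v$ and $M_{\rm HI}^{\rm lim}$. In particular one can compute 
$\mathcal{R}_{\Delta v}$, $S_{\Delta v}$ and $t_{\Delta v}$ from $\mathcal{R}_{800}$, $S_{800}$ and $t_{800}$. We see that 
the survey rate is given by 

$$\mathcal{R}_{\Delta v} = \mathcal{R}_{800}\left(\frac{800\,{\rm km\,sec}^{-1}}{\Delta v}\right)^{1/2}. \qquad (12)$$

For an HI mass of $M_{\rm HI}^{\rm lim}$ the required flux density is 

$$S_{\Delta v} = S_{800}\left(\frac{800\,{\rm km\,sec}^{-1}}{\Delta v}\right)\left(\frac{M_{\rm HI}^{\rm lim}}{10^{11}M_\odot}\right), \qquad (13)$$

and the actual integration time required to achieve a $5\sigma$ 
detection of such an object on an area of 100 deg$^2$ is 

$$t_{\Delta v} = t_{800}\left(\frac{\Delta v}{800\,{\rm km\,sec}^{-1}}\right)\left(\frac{10^{11}M_\odot}{M_{\rm HI}^{\rm lim}}\right)^{2}. \qquad (14)$$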



If we assume that $\Delta v = 100\,{\rm km\,sec}^{-1}$ then we see that 
it would be possible to cover close to 260 deg$^2$ in a day of 
integration with $M_{\rm HI}^{\rm lim} = 10^{12}M_\odot$ using setup II at $f_{\rm obs} = 600$ MHz. 
Since there are $\approx 0.15$ objects per deg$^2$ per 1% bandwidth 
above the corresponding mass limit, one would hope to 
find around 200 collapsing objects. It would take around 10 
times as long to do a similar survey at $f_{\rm obs} = 400$ MHz and 
one would find approximately 10 times fewer objects, taking 
into account the larger value of $f_{\rm HI}$ and the very much 
reduced number of objects with a given mass. Nonetheless, 
$\sim 20$ objects in 10 days of integration time is definitely 
worthwhile. 

Neither of the above sets of survey parameters is optimal. If 
we assume that $\Delta v$ is either weakly dependent on $M_5$ or not 
at all, then the optimal mass limit would correspond to the 
value of $M_{200}$ for which $n = 2$, in contrast to the virialized 
case. Assuming that $M_{200} \approx M_{\rm vir}$, we see from fig.[l] that the 
optimal values of $M_{200}$ are $3\times10^{13}M_\odot$ and $2\times10^{12}M_\odot$ for 
$f_{\rm obs} = 600$ MHz and 400 MHz respectively. The corresponding 
HI mass limits are $M_{\rm HI}^{\rm lim} = 3\times10^{11}M_\odot$ and $3\times10^{10}M_\odot$. 
For $f_{\rm obs} = 600$ MHz, it would require 4.3 days of integration 
time to achieve this optimum depth on 100 deg$^2$. There are 
$\approx 2.3$ objects per deg$^2$ per 1% bandwidth and hence such a 
survey would find $\sim 1150$ objects (which is more than the 
$\sim 860$ that would be found by mapping 1110 deg$^2$ in 4.3 
days with a limit of $10^{12}M_\odot$ as suggested above). 
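
The arithmetic behind this comparison can be reproduced from the scaling of (14), in which the time per unit area goes as $\Delta v/(M_{\rm HI}^{\rm lim})^2$. The sketch below normalises to the rate of 260 deg$^2$ per day quoted above for setup II at 600 MHz, and assumes a 5% bandwidth (the figure mentioned in the abstract, which appears to be what the quoted totals imply); all names are assumptions made for illustration.

```python
RATE_REF = 260.0   # deg^2 per day quoted for setup II at 600 MHz
M_REF = 1.0e12     # Msun, HI mass limit to which RATE_REF applies
DV_REF = 100.0     # km/s, velocity width to which RATE_REF applies

def area_per_day(m_hi_lim, delta_v=DV_REF):
    """Area (deg^2) covered per day to a 5-sigma HI mass limit m_hi_lim.
    Time per unit area scales as delta_v / m_hi_lim**2, following (14)."""
    return RATE_REF * (m_hi_lim / M_REF) ** 2 * (DV_REF / delta_v)

def days_needed(area_deg2, m_hi_lim, delta_v=DV_REF):
    """Integration time in days to cover area_deg2 to the given limit."""
    return area_deg2 / area_per_day(m_hi_lim, delta_v)

def n_objects(per_deg2_per_percent, area_deg2, bandwidth_percent=5.0):
    """Number of objects; the 5% default bandwidth is an assumption."""
    return per_deg2_per_percent * area_deg2 * bandwidth_percent

deep_days = days_needed(100.0, 3.0e11)          # deep survey: 100 deg^2
deep_yield = n_objects(2.3, 100.0)
wide_area = area_per_day(1.0e12) * deep_days    # same time, shallower limit
wide_yield = n_objects(0.15, wide_area)
print(f"deep: {deep_days:.1f} days, {deep_yield:.0f} objects")
print(f"wide: {wide_area:.0f} deg^2, {wide_yield:.0f} objects")
```

With the rounded density of 0.15 objects per deg$^2$ per 1% bandwidth the wide survey yields somewhat over 800 objects, close to the $\sim 860$ quoted; in either case the deeper survey finds more.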



5 CONCLUSIONS 

To summarise, we have shown that instruments likely to be 
built within the next few years have a realistic chance of 
detecting large objects, both virialized and collapsing, using 
HI emission as their tracer, opening a new window on the 
universe. If a detection threshold of $M_{\rm HI}^{\rm lim} \approx 10^{11}M_\odot$ can be 
achieved at around $z \approx 0.4$ then it should be possible to find 
a surprisingly large number of virialized objects. Similarly, 
it should be possible to detect many objects at turnaround 
with $z > 1$. We have also made comments as to the optimal 
design of an instrument and the survey strategy for these 
applications. Clearly, more sophisticated simulations of the 
large-scale distribution of HI are required, but we believe 
that the basic picture we have put forward is likely to 
remain. 



It is clear that the detection of the large number of 
objects, both virialized and collapsing, predicted in this paper 
could have a significant impact on our understanding of the 
universe. In the regime where one can optimally detect 
virialized clusters ($z < 1$) it should be possible, using the extra 
velocity information, to accurately compute the dark matter 
mass of each of the objects which are detected and establish 
the selection function. Since the number of virialized objects 
is sensitive to cosmological parameters, accurate estimates 
of $\Omega_{\rm m}$ and $\sigma_8$ should be possible. Moreover, the properties of 
the dark energy may also be accessible to such an analysis. 
The nature of the collapsing structures for $z > 1$ is also of 
significant interest. We have used all the available information 
to make estimates of the number of objects which would 
be found. However, we have also noted that these are somewhat 
uncertain, particularly the velocity structure. Clearly 
the detection of a large number of objects will have a significant 
impact on our understanding of the distribution of HI 
at high redshifts and the ongoing process of galaxy formation. 